1. Using modelflow with World Bank models

The modelflow Python package has been developed to solve a wide range of models; see the modelflow GitHub web site for working examples of the Solow model, the FRB/US model and others.

The package has been substantially expanded to include special features that enable it to work with World Bank models originally developed in EViews and designed to use the EViews Model Object for simulation.

This chapter illustrates how to access these models, how to load them into a modelflow Anaconda environment on your computer and how to perform a variety of simulations.

1.1. Accessing a World Bank model

At this time several World Bank macrostructural models are available to download and use with modelflow. These include macrostructural models for:

  • Indonesia

  • Nepal

  • Croatia

  • Iraq

  • Kenya

  • Bolivia

Each of these models has been developed as part of the outreach work of the World Bank. The basic modelling framework of each of these models is outlined in Burns et al. [2019] with specific extensions reflecting features of the individual country modelled.

This book uses as an example a climate-aware model for Pakistan developed in 2020 and described in Burns et al. [2021].

The World Bank models are distributed in the pcim file format of modelflow and can be downloaded by right-clicking on the links above. To download the Pakistan model, right-click on its link, select Save Link As, and place the file in a directory accessible to your modelflow installation.

from worldbankMFModModels import pak

1.2. Preparing your python environment

As always, modelflow and the other Python packages that will be used need to be imported into your Python session. The examples in this book were written and solved in a Jupyter Notebook. Some Jupyter-specific commands are included in these examples, and these are annotated. However, the bulk of the content of the programs can be run in other environments, including Integrated Development Environments (IDEs) like Spyder or MS Visual Studio Code. All the programs have been tested under Spyder as well as Jupyter Notebook.

It is assumed that:

  1. you have already installed modelflow and its various support packages following the instructions in Chapter xx

  2. you are using Anaconda, and that

  3. you have activated your modelflow environment by executing the following command from your command line (for example the Anaconda prompt):

conda activate modelflow

where modelflow is the name you have given to the conda environment into which you installed modelflow.

# import the model class from modelflow package
from modelclass import model 
import modelmf       # Add useful features to pandas dataframes 
                     # using utilities initially developed for modelflow

   

model.widescreen()   # These modelflow commands ensure that outputs from modelflow play well with Jupyter Notebook
model.scroll_off()

1.3. Working with PakMod under modelflow

The basic method for working with any model is the same. Indeed the initial steps followed here are the same as were followed during the simple model discussion.

Process:

  1. Prepare the workspace

  2. Load the model into modelflow

  3. Design some scenarios

  4. Simulate the model

  5. Visualize the results

1.3.1. Load a pre-existing model, data and descriptions

To load a model use the model.modelload() method of modelflow.

The command below

mpak,bline = model.modelload('../models/pak.pcim',
                             alfa=0.7,run=1,keep='Baseline')

instantiates (creates an instance of) a modelflow model object and assigns it to the variable name mpak. The run=1 option executes the model and assigns the result of the model execution to the dataframe bline. The model is solved with the parameter alfa set to 0.7. The \(alfa \in (0,1)\) parameter determines the step size of the solution engine. The larger alfa the larger the step size. Larger step sizes may solve faster, but may have trouble finding a unique solution. Smaller step sizes take longer to solve but are more likely to find a unique solution. Values of alfa=.7 work well for World Bank models.

The keep option instructs modelflow to maintain in the model object (mpak) the results of the initial scenario, assigning it the text name Baseline.
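The effect of a step-size parameter like alfa can be illustrated with a toy fixed-point iteration. This is a minimal sketch and not modelflow's actual solver: the helper damped_solve and the single equation x = 10 - 1.5x are hypothetical, chosen only to show how a step size below one can stabilise an iteration that a full step does not.

```python
def damped_solve(g, x0, alfa, tol=1e-9, max_iter=200):
    """Iterate x <- alfa*g(x) + (1-alfa)*x until successive values agree.

    Hypothetical helper for illustration only; it is not part of modelflow.
    Returns (x, converged).
    """
    x = x0
    for _ in range(max_iter):
        x_new = alfa * g(x) + (1 - alfa) * x
        if abs(x_new - x) < tol:
            return x_new, True
        x = x_new
    return x, False


def toy_equation(x):
    # Toy one-equation "model" whose solution is x = 4 (4 = 10 - 1.5*4)
    return 10 - 1.5 * x


x_damped, ok_damped = damped_solve(toy_equation, x0=1.0, alfa=0.7)
x_full, ok_full = damped_solve(toy_equation, x0=1.0, alfa=1.0)
print(f"alfa=0.7: converged={ok_damped}, x={x_damped:.6f}")
print(f"alfa=1.0: converged={ok_full}")
```

In this toy case the full step (alfa=1.0) overshoots by more each iteration, while alfa=0.7 shrinks each update enough for the iteration to settle on the solution.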

Note that modelload() first looks for the file in the location specified and then in the global model repository.

#Replace the path below with the location of the pak.pcim file on your computer
mpak,bline = model.modelload('../models/pak.pcim',
                             alfa=0.7,run=1,keep='Baseline')
file read:  C:\modelflow manual\papers\mfibbook\content\models\pak.pcim

Note

The variable bline contains the dataframe with the results of the simulation. This is distinct from the data that is stored by the keep option. That said, the data associated with each, while stored separately, have the same numerical values. The keep option is described in more detail toward the end of this section.

Box [^BoxWBMnemonics]: World Bank Mnemonics

A typical World Bank model will have in excess of 300 variables. Each has a mnemonic that is structured in a specific way. The root of almost all mnemonics is 14 characters long (some special variables have additional characters appended to this root) (see discussion in section).

\[\texttt{12345678901234}\]
\[\color{green}{\texttt{CCC}}\color{red}{\texttt{AA}}\color{lime}{\texttt{MMM}}\color{blue}{\texttt{NNNN}}\color{magenta}{\texttt{U}}\color{black}{\texttt{C}}\]

where:

| Letters | Meaning |
| --- | --- |
| \(\color{green}{\texttt{CCC}}\) | The three-letter ISO code for a country, e.g. IDN for Indonesia, RUS for Russia |
| \(\color{red}{\texttt{AA}}\) | The two-letter major accounting system to which the variable attaches |
| \(\color{lime}{\texttt{MMM}}\) | The three-letter major sub-category of the data, e.g. GDP, EXP (expenditure) |
| \(\color{blue}{\texttt{NNNN}}\) | The four-letter minor sub-category, e.g. MKTP for market prices |
| \(\color{magenta}{\texttt{U}}\) | The measure (K: real variable; C: current values; X: prices) |
| \(\color{black}{\texttt{C}}\) | The currency (N: national currency; D: USD; P: PPP) |

Common mnemonics for the major accounting systems (the \(\color{red}{\texttt{AA}}\) characters above):

| Code | Meaning |
| --- | --- |
| NY | National income |
| NE | National expenditure accounts |
| NV | Value added accounts |
| GG | General government accounts |
| BX | Balance of Payments: Exports |
| BM | Balance of Payments: Imports |
| BN | Balance of Payments: Net |
| BF | Balance of Payments: Financial Account |

Thus

| Mnemonic | Meaning |
| --- | --- |
| IDNNYGDPMKTPKN | Indonesia GDP at market prices, real, in Indonesian rupiah |
| KENNECONPRVTXN | Kenya private (household) consumption expenditure deflator, in shillings |
| BOLGGEXPGNFSCN | Bolivia government expenditure on goods and services (GNFS) in current bolivianos |
| HRVGGREVDCITCN | Croatia government revenues, direct corporate income taxes, in current euros |
| NPLBXGSRNFSVCD | Nepal BOP exports of non-factor services (goods and services) in current USD |
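The structure described in the box can be made concrete by slicing a mnemonic into its six fields. This is a minimal sketch: the helper parse_mnemonic and its dictionary keys are hypothetical and not part of modelflow, and the helper simply rejects the few special variables whose names are shorter than the 14-character root.

```python
def parse_mnemonic(mnemonic):
    """Split the 14-character root of a World Bank mnemonic into its parts.

    Hypothetical helper, not part of modelflow. Anything after the
    14-character root (for example _A, _D or _X) is returned as the suffix.
    """
    if len(mnemonic) < 14:
        raise ValueError(f"{mnemonic!r} is shorter than the 14-character root")
    root, suffix = mnemonic[:14], mnemonic[14:]
    return {
        "country": root[0:3],    # CCC: ISO country code
        "accounts": root[3:5],   # AA: major accounting system
        "major": root[5:8],      # MMM: major sub-category
        "minor": root[8:12],     # NNNN: minor sub-category
        "measure": root[12],     # U: K, C or X
        "currency": root[13],    # C: N, D or P
        "suffix": suffix,
    }


parts = parse_mnemonic("IDNNYGDPMKTPKN")
print(parts)
```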

1.3.2. Extracting information about the model

The newly loaded Python object mpak is an instance of the model class and as such inherits the methods (functions) and properties (data) of that class. A variety of methods can be used to extract information about the model and its data.

1.3.3. Information about a specific variable

| Method | Example | Information returned |
| --- | --- | --- |
| .<var name> | modelname.PAKNECONPRVTXN | The equation (formula), variable descriptions and data |
| .<var name>.frml | modelname.PAKNECONPRVTXN.frml | The equation (formula) and variable descriptions |
| .<var name>.show | modelname.PAKNECONPRVTXN.show | The equation (formula), variable descriptions and variable values |

mpak.PAKNECONOTHRXN.frml
Endogeneous: PAKNECONOTHRXN: 
Formular: FRML <DAMP,STOC> PAKNECONOTHRXN = (PAKNECONOTHRXN(-1)*EXP(PAKNECONOTHRXN_A+ (0.590372627657176*((LOG(PAKNYGDPFCSTXN))-(LOG(PAKNYGDPFCSTXN(-1))))+((PAKGGREVGNFSXN/100)-(PAKGGREVGNFSXN(-1)/100))+(1-0.590372627657176)*((LOG(PAKNEIMPGNFSXN))-(LOG(PAKNEIMPGNFSXN(-1))))+0.2*PAKNYGDPGAP_/100) )) * (1-PAKNECONOTHRXN_D)+ PAKNECONOTHRXN_X*PAKNECONOTHRXN_D  $

PAKNECONOTHRXN  : 
PAKGGREVGNFSXN  : Goods and services Tax Rate
PAKNECONOTHRXN_A: Add factor:PAKNECONOTHRXN
PAKNECONOTHRXN_D: Fix dummy:PAKNECONOTHRXN
PAKNECONOTHRXN_X: Fix value:PAKNECONOTHRXN
PAKNEIMPGNFSXN  : Imp., GNFS (NIA), LCU Price defl. 2000 = 1
PAKNYGDPFCSTXN  : GDP Factor Cost Local Currency units Implicit Price deflator
PAKNYGDPGAP_    : Output Gap (% of Potential GDP)
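The formula printed above follows a recognisable pattern: the lagged value is scaled by the exponential of the add factor (_A) plus the growth terms, and a fix dummy (_D) switches between that result and a fixed value (_X). The sketch below reproduces only that structure in plain Python. The function name and arguments are hypothetical, and the numbers are made up for illustration.

```python
import math


def normalized_value(lagged, growth_terms, add_factor=0.0,
                     fix_dummy=0, fix_value=0.0):
    """Evaluate X = X(-1)*exp(X_A + growth_terms)*(1 - X_D) + X_X*X_D.

    Mirrors the structure of the FRML shown above; illustrative only.
    """
    model_value = lagged * math.exp(add_factor + growth_terms)
    return model_value * (1 - fix_dummy) + fix_value * fix_dummy


# With the dummy at 0 the equation determines the value ...
free = normalized_value(lagged=1.50, growth_terms=0.04)
# ... and with the dummy at 1 the variable is held at the _X value.
fixed = normalized_value(lagged=1.50, growth_terms=0.04,
                         fix_dummy=1, fix_value=1.55)
print(free, fixed)
```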

1.3.4. Information about a number of variables that meet certain search criteria

Often it can be useful to view information about, to visualize or to do calculations on a number of variables in a model. This can be done by selecting the variables with Python's index operator [] applied to a model and then performing operations on the selected variables.

1.3.4.1. Selecting variables that meet certain search criteria

One or more variables can be selected by specifying modelname['<search pattern>']. The pattern can be used to match variables based on their names, their descriptions or their group:

| Matching variable | Prefix | Example | Comment |
| --- | --- | --- | --- |
| Names | (none) | mpak['PAKNECON*XN PAKNECONPRVT?N'] | list and wildcards allowed |
| Descriptions | ! | mpak['!*GDP*'] | list and wildcards allowed |
| Group | # | modelname['#Headline'] | only one pattern allowed |

The * character in the command mpak['PAKNECON*XN'].names is a wildcard character and the expression will return all variables that begin with PAKNECON and end with XN.

The ? is another wildcard character. It will match only a single character. Thus mpak['PAKNECONPRVT?N'].names would return three variables: PAKNECONPRVTKN, PAKNECONPRVTCN, and PAKNECONPRVTXN, which are the real, current value, and deflator series for household consumption expenditure.
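The behaviour of the two wildcards can be tried out without loading a model. The sketch below uses the fnmatchcase function from Python's standard library as a stand-in; it is an assumption that this mirrors the matching shown above, since modelflow's own implementation is not described here. The helper select is hypothetical, and the names are taken from this chapter.

```python
from fnmatch import fnmatchcase

# A handful of variable names that appear in this chapter
names = [
    "PAKNECONENGYXN", "PAKNECONGOVTXN", "PAKNECONOTHRXN",
    "PAKNECONPRVTCN", "PAKNECONPRVTKN", "PAKNECONPRVTXN",
]


def select(pattern, candidates):
    """Return the candidates matching a shell-style pattern (hypothetical helper)."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


star_matches = select("PAKNECON*XN", names)       # * matches any run of characters
single_matches = select("PAKNECONPRVT?N", names)  # ? matches exactly one character
print(star_matches)
print(single_matches)
```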

After the variables have been selected, a number of operations can be performed on them. A few of these operations are explained below. There are many more; they can be viewed here - a reference to be inserted

Note

The variable descriptions are contained in a model class property named .var_description. This is a dictionary.

The variable groups are contained in a model class property named .var_groups. This is also a dictionary.

The user can edit these if it is useful.

1.3.4.2. Operating on selected variables

| Method | Information returned |
| --- | --- |
| .des | Variable descriptions whose variable name matches |
| .names | Variable names which match |
| .frml | Returns a normalized version of the equation (the one actually used in modelflow) |
| .eviews | In models imported from EViews, reports the original EViews specification |

Note

The .eviews command returns the equations before they were normalized. In most cases this is a slightly more legible form. Here following the EViews syntax, \(\Delta ln()\) is written as dlog().

mpak['!*GDP*'].des
PAKBNCABFUNDCD_       : Current Account Balance (% of GDP)
PAKGDPPCKD            : GDP per capita, 2000 US$ mn
PAKGDPPCKN            : GDP per capita, 2005 LCU mn
PAKNYGDPDISCCN        : GDP Disc., LCU mn
PAKNYGDPDISCKN        : GDP Disc., 2000 LCU mn
PAKNYGDPFCSTKN        : GDP Factor Cost Local Currency units Volumes National base year
PAKNYGDPFCSTXN        : GDP Factor Cost Local Currency units Implicit Price deflator
PAKNYGDPFCSTXN_A      : Add factor:GDP Factor Cost Local Currency units Implicit Price deflator
PAKNYGDPFCSTXN_D      : Fix dummy:GDP Factor Cost Local Currency units Implicit Price deflator
PAKNYGDPFCSTXN_FITTED : Fitted  value:GDP Factor Cost Local Currency units Implicit Price deflator
PAKNYGDPFCSTXN_X      : Fix value:GDP Factor Cost Local Currency units Implicit Price deflator
PAKNYGDPGAP_          : Output Gap (% of Potential GDP)
PAKNYGDPMKTPCD        : GDP, Market Prices, US$ mn
PAKNYGDPMKTPCN        : GDP, Market Prices, LCU mn
PAKNYGDPMKTPKD        : GDP, Market Prices, 2000 US$ mn
PAKNYGDPMKTPKN        : Real GDP
PAKNYGDPMKTPXN        : GDP, Marker Prices, LCU Price defl., 2000 = 1

mpak['PAKNECON*XN'].eviews
PAKNECONENGYXN : DLOG(PAKNECONENGYXN) = DLOG(PAKNECONENGYGN) + 0.0550557534575806*DUMH
PAKNECONGOVTXN : DLOG(PAKNECONGOVTXN) =- 0.3*(LOG(PAKNECONGOVTXN( - 1)) - LOG(PAKNECONPRVTXN( - 1))) + 0.0752362082122748 + 0.5*DLOG(PAKNECONGOVTXN( - 1)) + (1 - 0.5)*DLOG(PAKNECONPRVTXN)
PAKNECONOTHRXN : DLOG(PAKNECONOTHRXN) = 0.590372627657176*DLOG(PAKNYGDPFCSTXN) + D(PAKGGREVGNFSXN/100) + (1 - 0.590372627657176)*DLOG(PAKNEIMPGNFSXN) + 0.2*PAKNYGDPGAP_/100
PAKNECONPRVTXN : @IDENTITY PAKNECONPRVTXN  = ((PAKNECONENGYSH^PAKCESENGYCON)  * PAKNECONENGYXN^(1  - PAKCESENGYCON)  + (PAKNECONOTHRSH^PAKCESENGYCON)  * PAKNECONOTHRXN^(1  - PAKCESENGYCON))^(1  / (1  - PAKCESENGYCON))

mpak['PAKNECON*XN'].frml
PAKNECONENGYXN : FRML <DAMP,STOC> PAKNECONENGYXN = (PAKNECONENGYXN(-1)*EXP(PAKNECONENGYXN_A+ (((LOG(PAKNECONENGYGN))-(LOG(PAKNECONENGYGN(-1))))+0.0550557534575806*DUMH) )) * (1-PAKNECONENGYXN_D)+ PAKNECONENGYXN_X*PAKNECONENGYXN_D $
PAKNECONGOVTXN : FRML <DAMP,STOC> PAKNECONGOVTXN = (PAKNECONGOVTXN(-1)*EXP(PAKNECONGOVTXN_A+ (-0.3*(LOG(PAKNECONGOVTXN(-1))-LOG(PAKNECONPRVTXN(-1)))+0.0752362082122748+0.5*((LOG(PAKNECONGOVTXN(-1)))-(LOG(PAKNECONGOVTXN(-2))))+(1-0.5)*((LOG(PAKNECONPRVTXN))-(LOG(PAKNECONPRVTXN(-1))))) )) * (1-PAKNECONGOVTXN_D)+ PAKNECONGOVTXN_X*PAKNECONGOVTXN_D $
PAKNECONOTHRXN : FRML <DAMP,STOC> PAKNECONOTHRXN = (PAKNECONOTHRXN(-1)*EXP(PAKNECONOTHRXN_A+ (0.590372627657176*((LOG(PAKNYGDPFCSTXN))-(LOG(PAKNYGDPFCSTXN(-1))))+((PAKGGREVGNFSXN/100)-(PAKGGREVGNFSXN(-1)/100))+(1-0.590372627657176)*((LOG(PAKNEIMPGNFSXN))-(LOG(PAKNEIMPGNFSXN(-1))))+0.2*PAKNYGDPGAP_/100) )) * (1-PAKNECONOTHRXN_D)+ PAKNECONOTHRXN_X*PAKNECONOTHRXN_D $
PAKNECONPRVTXN : FRML <IDENT> PAKNECONPRVTXN = ((PAKNECONENGYSH**PAKCESENGYCON)*PAKNECONENGYXN**(1-PAKCESENGYCON)+(PAKNECONOTHRSH**PAKCESENGYCON)*PAKNECONOTHRXN**(1-PAKCESENGYCON))**(1/(1-PAKCESENGYCON)) $

mpak['PAKNECONPRVT?N'].names
['PAKNECONPRVTCN', 'PAKNECONPRVTKN', 'PAKNECONPRVTXN']

mpak['#Headline'].des
PAKGDPPCKN     : GDP per capita, 2005 LCU mn
PAKGGBALEXGRCN : Government Balance, excl. grants

1.3.4.3. The matching pattern can be a list

When matching names and descriptions, the pattern can be a list:

mylist=['PAKNECONPRVTKN','PAKNECONGOVTKN','PAKNEGDIFTOTKN','PAKNEEXPGNGSKN','PAKNEIMPGNFSKN']
mpak[mylist].des
PAKNECONPRVTKN : HH. Cons Real
PAKNECONGOVTKN : Gov. Cons real
PAKNEGDIFTOTKN : Investment real
PAKNEIMPGNFSKN : Imports real

1.4. Behavioural equations in the MFMod framework

Recall that a behavioural equation determines the value of an endogenous variable. For many of the variables in World Bank models, behavioural equations are estimated using an Error Correction Framework that splits the equation into a theoretically determined long run component and a more idiosyncratic short-run component.

Looking at the EViews representation of the consumption function:

DLOG(PAKNECONPRVTKN) =- 0.2*(LOG(PAKNECONPRVTKN( - 1)) - LOG(1.21203101101442) - LOG((((PAKBXFSTREMTCD( - 1) - PAKBMFSTREMTCD( - 1))*PAKPANUSATLS( - 1)) + PAKGGEXPTRNSCN( - 1) + PAKNYYWBTOTLCN( - 1)*(1 - PAKGGREVDRCTXN( - 1)/100))/PAKNECONPRVTXN( - 1))) + 0.763938860758873*DLOG((((PAKBXFSTREMTCD - PAKBMFSTREMTCD)*PAKPANUSATLS) + PAKGGEXPTRNSCN + PAKNYYWBTOTLCN*(1 - PAKGGREVDRCTXN/100))/PAKNECONPRVTXN) - 0.0634474791568939*@DURING("2009") - 0.3*(PAKFMLBLPOLYXN/100 - DLOG(PAKNECONPRVTXN))

Below the mnemonics are simplified to ease reading of the equation using:

| Model Mnemonic | Simplified | Meaning |
| --- | --- | --- |
| PAKNECONPRVTKN | \(CON^{KN}_t\) | Household Consumption |
| (PAKBXFSTREMTCD - PAKBMFSTREMTCD)*PAKPANUSATLS | \(Remit^{net}_t\) | Net remittances inflows in LCU |
| PAKGGEXPTRNSCN | \(TRANSF^{hhld}_t\) | Government transfers to households |
| @DURING("2009") | \(D^{2009}_t\) | A dummy for 2009 |
| PAKFMLBLPOLYXN | \(r^{policy}_t\) | Policy Rate |
| PAKGGREVDRCTXN | \(DirectTxR_t\) | Direct Taxes: Effective rate |
| PAKNECONPRVTKN_A | \(CON^{KN\_AF}_t\) | Add factor: Household Consumption |
| PAKNECONPRVTXN | \(CON^{XN}_t\) | Household Consumption Deflator |
| PAKNYYWBTOTLCN | \(WAGEBILL^{CN}_t\) | Economy-wide wage bill |

\[\begin{align*} \Delta \log(CON^{KN}_t) = &-0.2*\bigg[\log(CON^{KN}_{t-1})-\log(1.212)-\log\bigg({\frac{Remit^{net}_{t-1}+TRANSF^{hhld}_{t-1}+WAGEBILL^{CN}_{t-1}*(1-DirectTxR_{t-1}/100)}{CON^{XN}_{t-1}}}\bigg)\bigg] \\ &+0.76*\Delta \log \bigg({\frac{Remit^{net}_{t}+TRANSF^{hhld}_{t}+WAGEBILL^{CN}_{t}*(1-DirectTxR_{t}/100)}{CON^{XN}_{t}}}\bigg) \\ &-0.063*D^{2009}_t-0.3*\bigg(r^{policy}_t/100-\Delta \log(CON^{XN}_{t})\bigg) +CON^{KN\_AF}_t \end{align*}\]

In this instance the short-run elasticity of consumption to disposable income is 0.76, and the short-run response of consumption to the real interest rate is -0.3.
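A small numerical check can make the short-run elasticity concrete. The sketch below evaluates the right-hand side of the EViews equation above, with the coefficients copied from it and disposable income collapsed into a single real series. The function and argument names are hypothetical, and the input values are made up so that the economy starts exactly on the long run path with a zero real interest rate.

```python
import math

# Coefficients copied from the EViews equation above
ECM_SPEED = 0.2
LONG_RUN_SCALE = 1.21203101101442
INCOME_ELASTICITY = 0.763938860758873
DUMMY_2009 = -0.0634474791568939
REAL_RATE_EFFECT = -0.3


def consumption_growth(cons_lag, income_lag, income_growth,
                       policy_rate, inflation, dummy_2009=0):
    """Return dlog(real consumption) implied by the equation.

    cons_lag and income_lag are last period's real consumption and real
    disposable income; income_growth and inflation are log differences;
    policy_rate is in percent. Argument names are hypothetical.
    """
    gap = math.log(cons_lag) - math.log(LONG_RUN_SCALE) - math.log(income_lag)
    return (-ECM_SPEED * gap
            + INCOME_ELASTICITY * income_growth
            + DUMMY_2009 * dummy_2009
            + REAL_RATE_EFFECT * (policy_rate / 100 - inflation))


# Start on the long run path with a zero real interest rate,
# then raise real disposable income by 1 percent.
growth = consumption_growth(cons_lag=LONG_RUN_SCALE * 100.0, income_lag=100.0,
                            income_growth=0.01, policy_rate=5.0, inflation=0.05)
print(growth)
```

With the error correction and interest rate terms switched off in this way, a 1 percent rise in real disposable income raises consumption growth by roughly 0.76 percentage points, which is the short-run elasticity.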

1.4.1. The ECM specification

The ECM approach used in World Bank models is described in [Wickens and Breusch, 1988]. It addresses the above challenge by modelling both the long run relationship and the short run behaviour and bringing them together into one equation.

The ECM specification is therefore a single equation comprised of two parts (the long run relationship and the short-run relationship).

Consider as an example two variables, say consumption and disposable income. Both have an underlying trend or, in the parlance, are integrated of order 1. For simplicity we call them Y and X.

1.4.1.1. The short run relationship

In its simplest form we might have a short run relationship between the growth rates of our two variables such that:

\[\Delta log(Y_t) = \alpha + \beta \Delta log(X_t) +\epsilon_t\]

or, substituting lower case letters for the logged values:

\[\Delta y_t = \alpha + \beta \Delta x_t +\epsilon_t\]

1.4.1.2. The long run equation

The long run relates the level of the two (or more) variables. We can write a simple version of that equation as:

\[Y_t=\alpha X_t^{\beta} e^{\eta_t}\]

Rewriting this in logarithms, it can be expressed as:

\[y_t = \ln(\alpha) + \beta x_t + \eta_t\]

1.4.2. The long run equation in the steady state

First we note that in the steady state the expected value of the error term in the long run equation is zero (\(\eta_t=0 \)) so in those conditions we can simplify the long run relationship to:

\[y_t=\ln(\alpha)+\beta x_t\]

or equivalently (substituting A for the log of \(\alpha\)):

\[y_t-A-\beta x_t=0\]

Moreover, if we multiplied this by some arbitrary constant, say \(-\lambda\), it would still equal zero:

\[-\lambda(y_t -A-\beta x_t)=0\]

and in the steady state this will also be true for the lagged variables

\[-\lambda(y_{t-1} - A - \beta x_{t-1})=0\]

1.5. Putting it together

From before we have the short run equation:

\[\Delta y_t = \alpha + \beta \Delta x_t +\epsilon_t\]

Inserting the steady state expression into the short run equation makes no difference (in the long run) because in the long run it is equal to zero.

\[\Delta y_t = -\lambda(y_{t-1}-A-\beta x_{t-1}) + \alpha + \beta \Delta x_t +\epsilon_t\]

When the model is not in the steady state the expression \(y_{t-1}-A-βx_{t-1}\) is of course the error term from the long run equation (a measure of how far the dependent variable is from equilibrium).
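To see how the ECM term acts on this error, it can help to track the gap itself. As a sketch, define \(e_t = y_t - A - \beta x_t\) (a symbol introduced here only for illustration) and substitute the combined equation for \(\Delta y_t\):

\[\begin{align*} e_t &= y_{t-1} + \Delta y_t - A - \beta (x_{t-1} + \Delta x_t) \\ &= e_{t-1} - \lambda e_{t-1} + \alpha + \beta \Delta x_t + \epsilon_t - \beta \Delta x_t \\ &= (1-\lambda)\, e_{t-1} + \alpha + \epsilon_t \end{align*}\]

Setting aside the constant and the shocks (\(\alpha = 0\), \(\epsilon_t = 0\)), the gap is multiplied by \((1-\lambda)\) each period, so that \(e_t = (1-\lambda)^t e_0\).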

1.5.1. Lambda, the speed of adjustment

The parameter \(\lambda\) can be interpreted as the speed of adjustment. As long as \(\lambda\) is greater than zero and less than or equal to one, and there are no further disturbances (\(\epsilon_t=0\)), the expression multiplied by \(\lambda\) will decline steadily toward zero. How fast depends on how large or small \(\lambda\) is.

To be convergent, \(\lambda\) must lie between 0 and 2. If it is negative or greater than 2, the long run portion of the equation will cause the disequilibrium to grow each period rather than diminish. If \(\lambda\) is between 1 and 2, the disequilibrium will still shrink, but it will oscillate from positive to negative as it does so.

Intuitively, the long run error term measures how far we were from equilibrium one period earlier (at t-1). The ECM term ensures that we will slowly converge to equilibrium, the point at which the long run equation holds exactly. If \(\lambda\) is greater than zero but less than or equal to one, some portion of the previous period's disequilibrium will be absorbed each period. How much is absorbed depends on the size of the estimated speed of adjustment coefficient \(\lambda\).
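The different ranges of \(\lambda\) can be checked numerically. The sketch below is illustrative only: it assumes no shocks and no constant term, so that each period's disequilibrium is simply (1 - lambda) times the previous one. The helper gap_path and the chosen values of lambda are hypothetical.

```python
def gap_path(lam, gap0=1.0, periods=6):
    """Path of the disequilibrium when gap_t = (1 - lam) * gap_{t-1}.

    Illustrative only: assumes no shocks and no constant term.
    """
    path = [gap0]
    for _ in range(periods):
        path.append((1 - lam) * path[-1])
    return path


# 0.2: smooth decay; 1.0: closed in one period; 1.5: shrinking oscillation;
# 2.5 and -0.5: the gap grows instead of shrinking
for lam in (0.2, 1.0, 1.5, 2.5, -0.5):
    print(lam, [round(g, 3) for g in gap_path(lam)])
```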

Looking at an ECM equation we can then break it up into its component parts. For the consumption function it will look something like this:

\[\Delta c_t = -\lambda \Big( \underbrace{\log(C_{t-1})-\log(Wages_{t-1}-Taxes_{t-1}+Transfers_{t-1}) + \alpha}_{\text{Long run}} \Big) +\beta \underbrace{\Delta y_t}_{\text{short run}}\]